Question Analysis Report

Generated: 2025-07-03T18:20:57.587008

Executive Summary

Dataset Size: 32,400 observations
Features: 478 total
Models Analyzed: 10 outcomes
Best R²: 0.981

Model Performance Summary

Outcome | R² | Adj. R² | F-statistic | F p-value | AIC | BIC | RMSE | N Significant Features | High VIF Features | Mean VIF | Max VIF | Sample Size
proportion_left_leaning | 0.2116 | 0.2102 | 147.14 | 0.0000 | 214549.7 | 215052.9 | 6.6266 | 24 | 5 | 3.13 | 20.43 | 32,400
proportion_right_leaning | 0.0200 | 0.0182 | 11.18 | 0.0000 | 105371.6 | 105874.8 | 1.2291 | 29 | 5 | 3.13 | 20.43 | 32,400
proportion_center_leaning | 0.8981 | 0.8979 | 4829.53 | 0.0000 | 215273.4 | 215776.5 | 6.7010 | 25 | 5 | 3.13 | 20.43 | 32,400
proportion_high_quality | 0.9807 | 0.9807 | 27873.57 | 0.0000 | 162607.9 | 163111.0 | 2.9728 | 17 | 5 | 3.13 | 20.43 | 32,400
proportion_low_quality | 0.0619 | 0.0602 | 36.19 | 0.0000 | 162607.9 | 163111.0 | 2.9728 | 17 | 5 | 3.13 | 20.43 | 32,400
news_proportion_left_leaning | 0.1241 | 0.1225 | 77.64 | 0.0000 | 277959.4 | 278462.5 | 17.6305 | 30 | 5 | 3.13 | 20.43 | 32,400
news_proportion_right_leaning | 0.0217 | 0.0199 | 12.15 | 0.0000 | 202601.4 | 203104.5 | 5.5107 | 23 | 5 | 3.13 | 20.43 | 32,400
news_proportion_center_leaning | 0.5477 | 0.5468 | 663.64 | 0.0000 | 306458.4 | 306961.6 | 27.3696 | 33 | 5 | 3.13 | 20.43 | 32,400
news_proportion_high_quality | 0.6237 | 0.6230 | 908.50 | 0.0000 | 299421.1 | 299924.3 | 24.5530 | 39 | 5 | 3.13 | 20.43 | 32,400
news_proportion_low_quality | 0.0342 | 0.0325 | 19.44 | 0.0000 | 232369.4 | 232872.6 | 8.7241 | 21 | 5 | 3.13 | 20.43 | 32,400
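
The AIC, BIC, and adjusted R² columns can be cross-checked against each other. The sketch below assumes the usual definitions for an OLS fit (AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n, with k counting the intercept), which is how statsmodels reports them as far as we know. The helper names are hypothetical and are not part of the pipeline that generated this report. It recovers the implied parameter count k from the proportion_left_leaning row and then recomputes the adjusted R².

```python
import math


def params_from_aic_bic(aic: float, bic: float, n: int) -> float:
    """Solve BIC - AIC = k * (ln(n) - 2) for k, the number of estimated parameters."""
    return (bic - aic) / (math.log(n) - 2.0)


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R² for a model with k estimated parameters (intercept included)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# Values copied from the proportion_left_leaning row of the table above.
n = 32_400
k_est = params_from_aic_bic(aic=214549.7, bic=215052.9, n=n)
k = round(k_est)  # assumption: k is an integer blurred only by rounding in the table
adj = adjusted_r2(r2=0.2116, n=n, k=k)

print(f"implied parameter count: {k_est:.2f} (rounded to {k})")
print(f"recomputed adjusted R²: {adj:.4f} (reported: 0.2102)")
```

Because the BIC minus AIC gap is nearly the same in every row, the same parameter count is implied for all ten models, which is consistent with each outcome being regressed on one shared set of predictors.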

Correlation Matrix

Feature Importance

Regression Coefficients by Outcome

proportion_left_leaning (R² = 0.212, 15 significant features)

proportion_right_leaning (R² = 0.020, 15 significant features)

proportion_center_leaning (R² = 0.898, 15 significant features)

proportion_high_quality (R² = 0.981, 15 significant features)

proportion_low_quality (R² = 0.062, 15 significant features)

news_proportion_left_leaning (R² = 0.124, 15 significant features)

news_proportion_right_leaning (R² = 0.022, 15 significant features)

news_proportion_center_leaning (R² = 0.548, 15 significant features)

news_proportion_high_quality (R² = 0.624, 15 significant features)

news_proportion_low_quality (R² = 0.034, 15 significant features)

Model Family Comparisons

proportion_left_leaning

proportion_right_leaning

proportion_high_quality

proportion_news

num_citations

Multicollinearity Diagnostics

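Interpretation: The Variance Inflation Factor (VIF) measures multicollinearity. For predictor j, VIF_j = 1 / (1 - R_j²), where R_j² is the R² obtained by regressing that predictor on all the other predictors. A VIF of 1 indicates no collinearity; larger values indicate that the variance of the coefficient estimate is inflated by correlation among predictors. VIFs depend only on the predictors, not on the outcome, so the values listed are identical for all ten models.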
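
To help read the VIF figures in this report, the sketch below applies the standard definition of VIF to the reported Max VIF (20.43) and Mean VIF (3.13). The function names are hypothetical, and the report does not state the cutoff it uses to count a feature as "High VIF", so no threshold is assumed here.

```python
import math


def vif_from_r2(r2_j: float) -> float:
    """VIF of a predictor whose regression on the other predictors has R² = r2_j."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif: float) -> float:
    """Invert the definition: share of a predictor's variance explained by the others."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


max_vif_r2 = r2_from_vif(20.43)   # predictor with the reported Max VIF
mean_vif_r2 = r2_from_vif(3.13)   # a predictor whose VIF equals the reported mean
se_inflation = math.sqrt(20.43)   # standard-error inflation relative to VIF = 1

print(f"Max VIF 20.43 -> R_j² = {max_vif_r2:.3f}, SE inflated x{se_inflation:.2f}")
print(f"VIF 3.13      -> R_j² = {mean_vif_r2:.3f}")
```

Read this way, the most collinear predictor has roughly 95% of its variance explained by the remaining predictors, so its individual coefficient should be interpreted with caution.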

proportion_left_leaning (High VIF: 5, Mean VIF: 3.13)

proportion_right_leaning (High VIF: 5, Mean VIF: 3.13)

proportion_center_leaning (High VIF: 5, Mean VIF: 3.13)

proportion_high_quality (High VIF: 5, Mean VIF: 3.13)

proportion_low_quality (High VIF: 5, Mean VIF: 3.13)

news_proportion_left_leaning (High VIF: 5, Mean VIF: 3.13)

news_proportion_right_leaning (High VIF: 5, Mean VIF: 3.13)

news_proportion_center_leaning (High VIF: 5, Mean VIF: 3.13)

news_proportion_high_quality (High VIF: 5, Mean VIF: 3.13)

news_proportion_low_quality (High VIF: 5, Mean VIF: 3.13)

Summary Statistics

Variable | Type | Mean | Std | Min | Max | N | Missing
num_citations | Citation Outcome | 5.7652 | 5.1669 | 0.0000 | 46.0000 | 32,400 | 0
proportion_high_quality | Citation Outcome | 8.9662 | 21.3873 | 0.0000 | 100.0000 | 32,400 | 0
proportion_left_leaning | Citation Outcome | 1.6659 | 7.4564 | 0.0000 | 100.0000 | 32,400 | 0
proportion_right_leaning | Citation Outcome | 0.0819 | 1.2404 | 0.0000 | 50.0000 | 32,400 | 0
news_proportion_high_quality | Citation Outcome | 21.8282 | 39.9889 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_left_leaning | Citation Outcome | 4.7865 | 18.8207 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_right_leaning | Citation Outcome | 0.3746 | 5.5664 | 0.0000 | 100.0000 | 32,400 | 0
proportion_news | Citation Outcome | 10.7818 | 23.2094 | 0.0000 | 100.0000 | 32,400 | 0
turn_number | Question/Response Feature | 1.7057 | 2.0636 | 1.0000 | 39.0000 | 32,400 | 0
total_turns | Question/Response Feature | 2.5335 | 3.5807 | 1.0000 | 50.0000 | 32,400 | 0
question_length_chars_log | Question/Response Feature | 0.0000 | 1.0000 | -3.8234 | 2.6358 | 32,400 | 0
question_length_words_log | Question/Response Feature | 0.0000 | 1.0000 | -2.2585 | 2.9189 | 32,400 | 0
response_length_log | Question/Response Feature | 0.0000 | 1.0000 | -7.0885 | 3.1220 | 32,400 | 0
response_word_count_log | Question/Response Feature | 0.0000 | 1.0000 | -5.6188 | 2.9660 | 32,400 | 0
model_family_google | Model Family | 7,563 observations | 23.3% | - | - | 32,400 | 0
model_family_openai | Model Family | 11,168 observations | 34.5% | - | - | 32,400 | 0
model_family_perplexity | Model Family | 13,669 observations | 42.2% | - | - | 32,400 | 0
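
The Std values above can be combined with the RMSE values in the Model Performance Summary as a rough consistency check. The sketch below assumes that RMSE is the residual standard error (residual sum of squares divided by residual degrees of freedom) and that Std is the sample standard deviation; neither is stated in the report. Under those assumptions, 1 - (RMSE / Std)² should reproduce the adjusted R². The function name is hypothetical.

```python
def r2_from_rmse(rmse: float, std: float) -> float:
    """Share of outcome variance explained, estimated as 1 - (RMSE / Std)²."""
    return 1.0 - (rmse / std) ** 2


# outcome: (RMSE from Model Performance Summary, Std from Summary Statistics, reported Adj. R²)
rows = {
    "proportion_left_leaning": (6.6266, 7.4564, 0.2102),
    "proportion_right_leaning": (1.2291, 1.2404, 0.0182),
    "proportion_high_quality": (2.9728, 21.3873, 0.9807),
    "news_proportion_left_leaning": (17.6305, 18.8207, 0.1225),
    "news_proportion_right_leaning": (5.5107, 5.5664, 0.0199),
    "news_proportion_high_quality": (24.5530, 39.9889, 0.6230),
}

# Largest absolute gap between the recomputed and the reported value.
max_gap = 0.0
for outcome, (rmse, std, reported) in rows.items():
    implied = r2_from_rmse(rmse, std)
    max_gap = max(max_gap, abs(implied - reported))
    print(f"{outcome}: implied {implied:.4f}, reported {reported:.4f}")
```

Only outcomes that appear in both tables are included; any gap beyond what rounding to four decimals can explain would suggest that one of the two assumptions does not hold for this report.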

Technical Details

Regression Method: OLS_statsmodels

PCA Precomputed: True

PCA Used: True

Total Features: 61